98 research outputs found

    High performance HEVC and FVC video compression hardware designs

    Get PDF
    High Efficiency Video Coding (HEVC) is the current state-of-the-art video compression standard developed by Joint collaborative team on video coding (JCT-VC). HEVC has 50% better compression efficiency than H.264 which is the previous video compression standard. HEVC achieves this video compression efficiency by significantly increasing the computational complexity. Therefore, in this thesis, we proposed a low complexity HEVC sub-pixel motion estimation (SPME) technique for SPME in HEVC encoder. We designed and implemented a high performance HEVC SPME hardware implementing the proposed technique. We also designed and implemented an HEVC fractional interpolation hardware using memory based constant multiplication technique for both HEVC encoder and decoder. Future Video Coding (FVC) is a new international video compression standard which is currently being developed by JCT-VC. FVC offers much better compression efficiency than the state-of-the-art HEVC video compression standard at the expense of much higher computational complexity. In this thesis, we designed and implemented three different high performance FVC 2D transform hardware. The proposed hardware is verified to work correctly on an FPGA board

    An FPGA implementation of future video coding 2D transform

    Get PDF

    An HEVC fractional interpolation hardware using memory based constant multiplication

    Get PDF
    Fractional interpolation is one of the most computationally intensive parts of High Efficiency Video Coding (HEVC) video encoder and decoder. In this paper, an HEVC fractional interpolation hardware using memory based constant multiplication is proposed. The proposed hardware uses memory based constant multiplication technique for implementing multiplication with constant coefficients. The proposed memory based constant multiplication hardware stores pre-computed products of an input pixel with multiple constant coefficients in memory. Several optimizations are proposed to reduce memory size. The proposed HEVC fractional interpolation hardware, in the worst case, can process 35 quad full HD (3840×2160) video frames per second. It has up to 31% less energy consumption than original HEVC fractional interpolation hardware

    Assessment of Inpatients in Trakya University Hospital in Terms of Informed Consent Indicated by Ministry of Health’S Patient Rights Regulations

    Get PDF
    DergiPark: 379044tmsjAims: The informed consent constitutes the legal validity of any medical intervention and treatment, and it provides the patient with information about the procedure. In this study, we aimed to assess the extent informed consent is being applied in Trakya University Hospital in accordance to Ministry of Health’s regulations.Methods: A data collecting form has been prepared with respect to Ministry of Health’s relevant regulations and it was applied to 78 inpatients. The form consists of 18 two-point statements and 4 questions towards the patient’s demographic profile. As for descriptive statistics, numbers and percentages have been used for the statements, thus arithmetic mean ± standard deviation (minimum-maximum) has been used for the questions, respectively.Results: Out of all included patients, 47 (60.3 %) participants did not have any knowledge about the Informed Consent Form, 24 (30.8%) did not sign the Form, 48 (61.5%) participants did not read the Form and 49 (62.8%) did not understand the Form.Conclusion: From the results of this study, it can be inferred that the extent patients are being informed is still not on intended levels and is not satisfactory. Informed Consent is still a topic to emphasize, improve and put into practice more effectivel

    Design and implementation of a fast and scalable NTT-based polynomial multiplier architecture

    Get PDF
    In this paper, we present an optimized FPGA implementation of a novel, fast and highly parallelized NTT-based polynomial multiplier architecture, which proves to be effective as an accelerator for lattice-based homomorphic cryptographic schemes. As I/O operations are as time-consuming as NTT operations during homomorphic computations in a host processor/accelerator setting, instead of achieving the fastest NTT implementation possible on the target FPGA, we focus on a balanced time performance between the NTT and I/O operations. Even with this goal, we achieved the fastest NTT implementation in literature, to the best of our knowledge. For proof of concept, we utilize our architecture in a framework for Fan-Vercauteren (FV) homomorphic encryption scheme, utilizing a hardware/software co-design approach, in which polynomial multiplication operations are offloaded to the accelerator via PCIe bus while the rest of operations in the FV scheme are executed in software running on an off-the-shelf desktop computer. Specifically, our framework is optimized to accelerate Simple Encrypted Arithmetic Library (SEAL), developed by the Cryptography Research Group at Microsoft Research, for the FV encryption scheme, where large degree polynomial multiplications are utilized extensively. The hardware part of the proposed framework targets Xilinx Virtex-7 FPGA device and the proposed framework achieves almost 11x latency speedup for the offloaded operations compared to their pure software implementations

    Exploring RNS for Isogeny-based Cryptography

    Get PDF
    Isogeny-based cryptography suffers from a long-running time due to its requirement of a great amount of large integer arithmetic. The Residue Number System (RNS) can compensate for that drawback by making computation more efficient via parallelism. However, performing a modular reduction by a large prime which is not part of the RNS base is very expensive. In this paper, we propose a new fast and efficient modular reduction algorithm using RNS. Also, we evaluate our modular reduction method by realizing a cryptoprocessor for isogeny-based SIDH key exchange. On a Xilinx Ultrascale+ FPGA, the proposed cryptoprocessor consumes 151,009 LUTs, 143,171 FFs and 1,056 DSPs. It achieves 250 MHz clock frequency and finishes the key exchange for SIDH in 3.8 and 4.9 ms

    PROTEUS: A Tool to generate pipelined Number Theoretic Transform Architectures for FHE and ZKP applications

    Get PDF
    Emerging cryptographic algorithms such as fully homomorphic encryption (FHE) and zero-knowledge proof (ZKP) perform arithmetic involving very large polynomials. One fundamental and time-consuming polynomial operation is the Number theoretic transform (NTT) which is a generalization of the fast Fourier transform. Hardware platforms such as FPGAs could be used to accelerate the NTTs in FHE and ZKP protocols. One major problem is that the FHE and ZKP protocols require different parameter sets, e.g., polynomial degree and coefficient size, depending on their applications. Therefore, a basic research question is: How to design scalable hardware architectures for accelerating NTTs in the FHE and ZKP protocols? In this paper, we present ‘PROTEUS’, an open-source and parametric tool that generates synthesizable bandwidth-efficient NTT architectures for user-specified parameter sets. The architectures can be tuned to utilize different memory bandwidths and parameters which is a very important design requirement in both FHE and ZKP protocols. The generated NTT architectures show a significant performance speedup compared to similar NTT architectures on FPGA. Further comparisons with state-of-the-art show a reduction of up to 23% and 35% in terms of DSP and BRAM utilization

    Small MACs from Small Permutations

    Get PDF
    The concept of lightweight cryptography has gained in popularity recently, also due to various competitions and standardization efforts specifically targeting more efficient algorithms, which are also easier to implement. One of the important properties of lightweight constructions is the area of a hardware implementation, or in other words, the size of the implementation in a particular environment. Reducing the area usually has multiple advantages like decreased production cost or lower power consumption. In this paper, we focus on MAC functions and on ASIC implementations in hardware, and our goal is to minimize the area requirements in this setting. For this purpose, we design a new MAC scheme based on the well-known Pelican MAC function. However, in an effort to reduce the size of the implementation, we make use of smaller internal permutations. While this certainly leads to a higher internal collision probability, effectively reducing the allowed data, we show that the full security is still maintained with respect to other attacks, in particular forgery and key recovery attacks. This is useful in scenarios which do not require large amounts of data. Our detailed estimates, comparisons, and concrete benchmark results show that our new MAC scheme has the lowest area requirements and offers competitive performance. Indeed, we observe an area advantage of up to 30% in our estimated comparisons, and an advantage of around 13% compared to the closest competitor in a concrete implementation

    Efficient multiple constant multiplication using DSP blocks in FPGA

    Get PDF
    Multiple constant multiplication (MCM) operation multiplies an input variable with multiple constants. MCM operations are widely used in many applications such as video processing and compression. In this paper, a method is proposed for efficient implementation of MCM operations using DSP blocks in Xilinx FPGAs. The proposed method reduces number of DSP blocks used for implementing a given MCM operation by manipulating the multiple constants used in this MCM operation. In this paper, a high level synthesis tool implementing the proposed method is also proposed. The proposed tool takes the input variable bit length and multiple constants as inputs, and generates a Verilog RTL code which efficiently implements this MCM operation using DSP blocks. The proposed method and tool are used for one of the most complex video compression algorithms, HEVC 2D DCT. They reduced number of DSP blocks used in the FPGA implementation of HEVC 2D DCT algorithm by 35.8%
    corecore